Online-Academy
Look, Read, Understand, Apply

Data Mining And Data Warehousing

Reconstruction-based Outlier Detection using Autoencoders

    Idea of Reconstruction-based Methods

    The assumptions made are as follows:

  • Normal data has regular patterns that a model can learn.
  • Outliers deviate from these patterns and are harder to reconstruct.

    So, if we train a model to reconstruct its input, the reconstruction error will tend to be low for normal points and high for outliers.

    An autoencoder is a type of neural network with two parts:

    1. Encoder: Compresses the input x into a lower-dimensional latent representation z.
    2. Decoder: Reconstructs an approximation x_mean of the original input x from z.

    The network is trained to minimize the reconstruction loss, usually the Mean Squared Error (MSE): (x - x_mean)^2, averaged over the input features.
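
    The encoder, decoder, and MSE loss can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not a production model: the toy data, the purely linear layers (no biases or activations), the learning rate, and the names encode, decode, and mse are all hypothetical choices made for this example. A real autoencoder would normally be built with a deep learning library and nonlinear layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "normal" data (assumption): 200 points in 3-D lying close to a line.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))

# Linear autoencoder: 3 inputs -> 1 latent unit -> 3 outputs.
W_enc = 0.1 * rng.normal(size=(3, 1))  # encoder weights
W_dec = 0.1 * rng.normal(size=(1, 3))  # decoder weights

def encode(x):
    """Compress x into the lower-dimensional latent z."""
    return x @ W_enc

def decode(z):
    """Reconstruct x_mean from the latent z."""
    return z @ W_dec

def mse(x, x_mean):
    """Mean squared reconstruction error."""
    return np.mean((x - x_mean) ** 2)

initial_loss = mse(X, decode(encode(X)))

# Plain gradient descent on the MSE loss.
lr = 0.01
for _ in range(2000):
    Z = encode(X)
    X_mean = decode(Z)
    G = 2.0 * (X_mean - X) / X.size    # dLoss/dX_mean
    grad_dec = Z.T @ G                 # dLoss/dW_dec
    grad_enc = X.T @ (G @ W_dec.T)     # dLoss/dW_enc
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final_loss = mse(X, decode(encode(X)))
print(initial_loss, final_loss)
```

    After training, the loss on the toy data should be far below its starting value, because a single latent unit is enough to capture data that lies near a line.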

    Using it for Outlier Detection

    1. Train the autoencoder on normal data (without outliers).
    2. For any new data point p:
         a. Pass p through the autoencoder as the input x to obtain its reconstruction x_mean.
         b. Compute the reconstruction error, E = (x - x_mean)^2.
         c. If E > threshold, label p as an outlier.
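
    The detection procedure above can be sketched as follows. Several things here are assumptions made for illustration only. To keep the example short, the "trained autoencoder" is a stand-in: a linear autoencoder whose weights come from the top principal direction (computed with an SVD) rather than from gradient-based training. The threshold rule (99th percentile of the training errors) is also an assumption, since the text does not say how the threshold is chosen. The names reconstruct, reconstruction_error, and is_outlier are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: "normal" training data near a line in 3-D (toy assumption).
direction = np.array([[1.0, 2.0, -1.0]])
X_train = rng.normal(size=(500, 1)) @ direction + 0.05 * rng.normal(size=(500, 3))

# Stand-in for a trained autoencoder (assumption): project onto the
# top principal direction of the training data and map back.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
W = Vt[:1]                            # shape (1, 3)

def reconstruct(x):
    z = (x - mean) @ W.T              # encoder: 3-D -> 1-D latent
    return z @ W + mean               # decoder: 1-D latent -> 3-D

def reconstruction_error(x):
    """E = mean over features of (x - x_mean)^2."""
    return np.mean((x - reconstruct(x)) ** 2, axis=-1)

# Threshold (assumption): 99th percentile of the training errors.
threshold = np.percentile(reconstruction_error(X_train), 99)

def is_outlier(p):
    """Step 2: label p as an outlier if its error exceeds the threshold."""
    return reconstruction_error(p) > threshold

p_normal = 1.5 * direction[0]             # follows the learned pattern
p_outlier = np.array([2.0, -2.0, 2.0])    # far from the learned pattern

print(is_outlier(p_normal), is_outlier(p_outlier))
```

    The threshold controls the trade-off: a percentile-based rule like this one flags roughly 1% of the training points themselves, so a higher percentile gives fewer false alarms at the cost of missing milder outliers.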